Skip to content

[JENKINS-75675] Refactor class loading logic in order to reduce memory consumption#10659

Merged
basil merged 21 commits intojenkinsci:masterfrom
dukhlov:feature/JENKINS-75675
Aug 15, 2025
Merged

[JENKINS-75675] Refactor class loading logic in order to reduce memory consumption#10659
basil merged 21 commits intojenkinsci:masterfrom
dukhlov:feature/JENKINS-75675

Conversation

@dukhlov
Copy link
Contributor

@dukhlov dukhlov commented May 19, 2025

See JENKINS-75675.

Background

Our Jenkins installation relies heavily on Groovy pipelines, shared libraries, and similar mechanisms. Groovy uses dynamic invocation via MetaClass, CallSite, etc. During introspection, it attempts to load many classes that don’t exist, simply to determine whether a given token is a class, a variable, or something else.
My investigation shows that this behavior leads to memory leakage.

Problem

The base ClassLoader implementation in Java (which all other class loaders inherit from) supports two modes:

parallelCapable (currently used by all Jenkins core class loaders, including WebAppClassLoader and PlatformClassLoader)
non-parallel (legacy mode)
Parallel-capable class loaders create and retain a lock object per class name, indefinitely. So, if we have 1,000 class loading misses, every parallel-capable class loader in the hierarchy keeps 1,000 unused lock objects forever.

In the case of Jenkins's UberClassLoader, a typical setup with ~200 plugins results in around 500 class loaders. In our Jenkins instance, about 2,000 classes are loaded successfully, while we observe over 200,000 class loading misses.

As a result, the class loaders retain:

500 class loaders × 200,000 lock objects each
Plus the internal ConcurrentHashMap.Node objects used to store them
This results in roughly 2 GB of unnecessary memory consumption.

Solution

  • Introduce base DelegatingClassLoader implementation which overrides base ClassLoaders.loadClass to remove not needed base locking mechanism (it don't load class itself just delegates loading to another class loader and locking will be done here)
  • extend DelegatingClassLoader instead of ClassLoader if possible.
  • override getClassLoadingLock() for UrlClassLoader2 for use GCed locking object which won't be kept forever
  • Introduce FilteringClassLoader which can check if class exists (using getResource() call) and don't use loadClass() call not to create locking Object at all
  • wrap base WebAppClassLoader with FilteringClassLoader to avoid not needed lock files creation

Testing done

This PR tested by running 2-hour long proprietary automation tests based on jenkins scripted pipeline jobs
class loading logic is broadly used, so it definitely covered by this testing

Also I made jcmd GC.class_histogram. Amount of java.lang.Object and java.util.concurrent.ConcurrentHashMap$Node instances was decreased dramatically

Before fix:
Screenshot 2025-05-19 at 06 26 27

After fix:
Screenshot 2025-05-26 at 19 59 19

Proposed changelog entries

  • Reduce Plugin's ClassLoader memory consumption.

Proposed changelog category

/label internal

Proposed upgrade guidelines

N/A

Submitter checklist

  • The Jira issue, if it exists, is well-described.
  • The changelog entries and upgrade guidelines are appropriate for the audience affected by the change (users or developers, depending on the change) and are in the imperative mood (see examples). Fill in the Proposed upgrade guidelines section only if there are breaking changes or changes that may require extra steps from users during upgrade.
  • There is automated testing or an explanation as to why this change has no tests.
  • New public classes, fields, and methods are annotated with @Restricted or have @since TODO Javadocs, as appropriate.
  • New deprecations are annotated with @Deprecated(since = "TODO") or @Deprecated(forRemoval = true, since = "TODO"), if applicable.
  • New or substantially changed JavaScript is not defined inline and does not call eval to ease future introduction of Content Security Policy (CSP) directives (see documentation).
  • For dependency updates, there are links to external changelogs and, if possible, full differentials.
  • For new APIs and extension points, there is a link to at least one consumer.

Desired reviewers

@jenkinsci/core-pr-reviewers

Before the changes are marked as ready-for-merge:

Maintainer checklist

  • There are at least two (2) approvals for the pull request and no outstanding requests for change.
  • Conversations in the pull request are over, or it is explicit that a reviewer is not blocking the change.
  • Changelog entries in the pull request title and/or Proposed changelog entries are accurate, human-readable, and in the imperative mood.
  • Proper changelog labels are set so that the changelog can be generated automatically.
  • If the change needs additional upgrade steps from users, the upgrade-guide-needed label is set and there is a Proposed upgrade guidelines section in the pull request title (see example).
  • If it would make sense to backport the change to LTS, a Jira issue must exist, be a Bug or Improvement, and be labeled as lts-candidate to be considered (see query).

@welcome
Copy link

welcome bot commented May 19, 2025

Yay, your first pull request towards Jenkins core was created successfully! Thank you so much!

A contributor will provide feedback soon. Meanwhile, you can join the chats and community forums to connect with other Jenkins users, developers, and maintainers.

@krisstern krisstern requested a review from a team May 19, 2025 15:39
@basil
Copy link
Member

basil commented May 19, 2025

Would this not be better reported/patched against OpenJDK's java.net.URLClassLoader? While we could work around the issue as in this PR, I'm not sure we want to maintain custom class loading logic like this in the long term. We previously spent a long time maintaining a fork of Ant's class loader, which had various issues, so I would definitely prefer to use unmodified upstream OpenJDK code if possible.

@dukhlov
Copy link
Contributor Author

dukhlov commented May 19, 2025

@basil, thank you for the quick review. I understand your concern, but I don’t believe OpenJDK is the right place for this optimization.

This isn’t an issue with URLClassLoader itself — it’s designed to work correctly in typical Java applications, where an extra 10MB per class loader is a reasonable overhead in exchange for a simpler and faster implementation. The problem becomes significant in pluggable systems that use separate class loading. So, in my view, this should be addressed within the Jenkins plugin framework.

I’ve implemented a 10-line optimization that is easy to maintain or revert if it causes any issues. This is feasible because I used Guava’s MapMaker. Guava cannot be used in the OpenJDK codebase, however.

If you consider the ~5–10MB additional memory overhead per plugin acceptable, then let’s just reject the PR.

In my production Jenkins cluster, the heap usage is around 4GB when idle — 2GB of which is consumed by 50 billion unnecessary lock objects. It works fine, but the purist in me feels that this is not how it should be.

@dukhlov
Copy link
Contributor Author

dukhlov commented May 19, 2025

Alternative: It is possible to make jenkins.util.URLClassLoader2 not capable for parallel class loading by removing

static {
   registerAsParallelCapable();
}

In this case, class loading will become serial — URLClassLoader will use a single lock instead of a dedicated lock per class name.
However, I'm not fully aware of the complete set of use cases, so I can't say for sure whether we actually need parallel class loading within the scope of a single plugin or not.

@basil
Copy link
Member

basil commented May 19, 2025

This isn’t an issue with URLClassLoader itself — it’s designed to work correctly in typical Java applications, where an extra 10MB per class loader is a reasonable overhead in exchange for a simpler and faster implementation. The problem becomes significant in pluggable systems that use separate class loading.

I don't see anything in the URLClassLoader Javadoc that would confirm this viewpoint. The opposite assertion (that it ought to be efficient in pluggable systems) could be made without contradicting the Javadoc.

I hate wasting memory and I am not against this PR as a short-term solution, but only after we have a discussion with the OpenJDK folks about whether or not this is an upstream problem and, if so, what the long-term plan would be.

@dukhlov
Copy link
Contributor Author

dukhlov commented May 19, 2025

I don't see anything in the URLClassLoader Javadoc that would confirm this viewpoint. The opposite assertion (that it ought to be efficient in pluggable systems) could be made without contradicting the Javadoc.

Right, this is just my understanding. I’ve reviewed the existing standard ClassLoader implementation and I don’t have strong arguments to ask the OpenJDK team to change a standard mechanism that already meets the needs of 99% of Java applications — just to benefit Jenkins and perhaps a few other (~10) Java-based pluggable platforms.

That said, talking to the OpenJDK folks definitely makes sense if there’s a feasible path forward.

basil
basil previously requested changes May 19, 2025
Copy link
Member

@basil basil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@basil basil added the needs-comments-resolved Comments in the PR need to be addressed before it can be considered for merging label May 19, 2025
Copy link
Member

@jtnord jtnord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry forgot to submit this earlier.

I Agree with Basil, this appears to be something for the JDK.
Anything here should be a temporary workaround until we could pickup a JDK with a fix (should the JDK team agree this is an issue)

@dukhlov dukhlov force-pushed the feature/JENKINS-75675 branch 4 times, most recently from 25e29b6 to b9045b6 Compare May 24, 2025 20:48
@dukhlov dukhlov changed the title JENKINS-75675: Use weak references for URLClassLoader2's lock storage JENKINS-75675: Refactor classloader logic in order to reduce memory consumption May 24, 2025
@dukhlov dukhlov force-pushed the feature/JENKINS-75675 branch 14 times, most recently from fc1fa28 to feb64f2 Compare May 26, 2025 03:34
Copy link
Member

@basil basil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much! Sorry for the delay in reviewing this. I just took a look and added some commits with small improvements. Feel free to revert any of them if you think it makes your PR worse.

GPT-5 suggested using a per-instance monotonically increasing ID in getClassLoadingLock to avoid identityHashCode collision, as well as precomputing a per-instance prefix once to avoid rebuilding it each time. I'll leave it up to you as to whether you think one or both of those suggestions is valid.

Can you please test the ExistenceCheckingClassLoader portion with both Tomcat and Winstone? We only officially support Winstone, but a small percentage of Jenkins users still use Tomcat, and we try to have no regressions there. If this is asking too much, I can look into this myself.

I will also run a wider set of automated tests on this PR and report the results here.

@basil basil added squash-merge-me Unclean or useless commit history, should be merged only with squash-merge skip-changelog Should not be shown in the changelog and removed needs-comments-resolved Comments in the PR need to be addressed before it can be considered for merging labels Aug 15, 2025
This reverts commit 6ad9f3b.
@dukhlov
Copy link
Contributor Author

dukhlov commented Aug 15, 2025

Thanks very much! Sorry for the delay in reviewing this. I just took a look and added some commits with small improvements. Feel free to revert any of them if you think it makes your PR worse.

GPT-5 suggested using a per-instance monotonically increasing ID in getClassLoadingLock to avoid identityHashCode collision, as well as precomputing a per-instance prefix once to avoid rebuilding it each time. I'll leave it up to you as to whether you think one or both of those suggestions is valid.

Can you please test the ExistenceCheckingClassLoader portion with both Tomcat and Winstone? We only officially support Winstone, but a small percentage of Jenkins users still use Tomcat, and we try to have no regressions there. If this is asking too much, I can look into this myself.

I will also run a wider set of automated tests on this PR and report the results here.

Thank you for review! Added one more commit with improvements you mentioned. Will perform tomcat testing...

Copy link
Member

@basil basil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! I tested this interactively and verified the memory savings as in the Testing Done section. I also ran PCT and ATH successfully. The former isn't a very realistic test since it does not use these class loaders, and the latter is realistic but plugin coverage is low. The author of the PR says it has been deployed in production successfully, so that gives me confidence these changes are correct. I see no regressions, either functional or performance-wise.

Copy link
Member

@timja timja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great solution, thanks for bearing with us throughout the process.

Thanks Basil for doing the additional testing and reproducing the results


/label ready-for-merge


This PR is now ready for merge, after ~24 hours, we will merge it if there's no negative feedback.

Thanks!

Copy link

@A1exKH A1exKH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Comment on lines +78 to +89
/**
* Replace the JDK's per-name lock map with a GC-collectable lock object.
*
* <p>Parallel-capable {@link ClassLoader} implementations keep a distinct lock object per class
* name indefinitely, which can retain huge maps when there are many misses. Returning an
* interned {@link String} keyed by this loader and the class name preserves mutual exclusion
* for a given (loader, name) pair but allows the JVM to reclaim the lock when no longer
* referenced. Interned Strings are heap objects and GC-eligible on modern JDKs (7+).
*
* @param className the binary name of the class being loaded (must not be null)
* @return a lock object unique to this classloader/class pair
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost forgot: can you please update this comment to refer to the upstream JDK issue? Any deviation from upstream in URLClassLoader2 is a temporary workaround, but based on the discussion in this thread there is a sound justification. So please explain that this code should be removed once the upstream issue (with a link) is fixed.

@basil
Copy link
Member

basil commented Aug 15, 2025

(I plan to squash-merge this with @jtnord listed as a co-author as well, because he came up with the important idea of the string interning lock.)

@basil
Copy link
Member

basil commented Aug 15, 2025

I ran the Tomcat tests from https://github.com/jenkinsci/packaging/tree/master/molecule/servlet with the WAR from this PR, and the test passed, so I think this PR is ready for merge.

@basil
Copy link
Member

basil commented Aug 15, 2025

Very impressive change. Thanks again @dukhlov! I saw that you had removed the cleanup to UberClassLoader as part of simplifying this PR for review. We'd still be interested in that cleanup if you were interested in submitting it as a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ath-successful This PR has successfully passed the full acceptance-test-harness suite internal pct-successful This PR has successfully passed the full plugin-compatibility-test suite ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback rfe For changelog: Minor enhancement. use `major-rfe` for changes to be highlighted squash-merge-me Unclean or useless commit history, should be merged only with squash-merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants